
Massively Parallel Sort-Merge Joins in Main Memory Multi-Core Database Systems


Abstract

Two emerging hardware trends will dominate the database system technology in the near future: increasing main memory capacities of several TB per server and massively parallel multi-core processing. Many algorithmic and control techniques in current database technology were devised for disk-based systems where I/O dominated the performance. In this work we take a new look at the well-known sort-merge join which, so far, has not been in the focus of research in scalable massively parallel multi-core data processing as it was deemed inferior to hash joins. We devise a suite of new massively parallel sort-merge (MPSM) join algorithms that are based on partial partition-based sorting. Contrary to classical sort-merge joins, our MPSM algorithms do not rely on a hard to parallelize final merge step to create one complete sort order. Rather they work on the independently created runs in parallel. This way our MPSM algorithms are NUMA-affine as all the sorting is carried out on local memory partitions. An extensive experimental evaluation on a modern 32-core machine with one TB of main memory proves the competitive performance of MPSM on large main memory databases with billions of objects. It scales (almost) linearly in the number of employed cores and clearly outperforms competing hash join proposals - in particular it outperforms the "cutting-edge" Vectorwise parallel query engine by a factor of four.
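
The central idea in the abstract, joining independently sorted runs without first merging them into one global sort order, can be illustrated with a small sketch. This is not the authors' implementation: the function names (`split_into_chunks`, `merge_join`, `run_based_join`), the `(key, payload)` row layout, and the choice to pair every run of one input with every run of the other are assumptions made for illustration. Python threads are used only to show the structure of the parallel phases; they do not model NUMA-local memory or give true multi-core speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(rows, n):
    """Split rows into n contiguous chunks (trailing chunks may be empty)."""
    size = max(1, -(-len(rows) // n))  # ceiling division
    return [rows[i * size:(i + 1) * size] for i in range(n)]


def merge_join(run_r, run_s):
    """Merge join of two runs sorted by key; rows are (key, payload)."""
    out, i, j = [], 0, 0
    while i < len(run_r) and j < len(run_s):
        kr, ks = run_r[i][0], run_s[j][0]
        if kr < ks:
            i += 1
        elif kr > ks:
            j += 1
        else:
            # Find the group of S rows sharing this key ...
            j_end = j
            while j_end < len(run_s) and run_s[j_end][0] == kr:
                j_end += 1
            # ... and pair it with every R row that has the same key.
            while i < len(run_r) and run_r[i][0] == kr:
                out.extend((kr, run_r[i][1], s[1]) for s in run_s[j:j_end])
                i += 1
            j = j_end
    return out


def run_based_join(r, s, workers=4):
    """Hypothetical sketch: sort chunks locally, then join runs pairwise."""
    def sort_chunk(chunk):
        return sorted(chunk, key=lambda row: row[0])

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Phase 1: each worker sorts its own chunk into a run.
        r_runs = list(pool.map(sort_chunk, split_into_chunks(r, workers)))
        s_runs = list(pool.map(sort_chunk, split_into_chunks(s, workers)))

        # Phase 2: each worker joins its R run against every S run.
        # No global merge of the runs is ever performed.
        def join_one(r_run):
            result = []
            for s_run in s_runs:
                result.extend(merge_join(r_run, s_run))
            return result

        parts = list(pool.map(join_one, r_runs))
    return [row for part in parts for row in part]
```

Because the runs are never merged, the output of `run_based_join` is not globally sorted by key; only the set of matching pairs is guaranteed, which is all a join requires.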
